Feature: Google Drive - Presentation to Markdown / Audio|Video Transcribed (10MB Limit) #3075
that-dom wants to merge 34 commits into elastic:main
Added the ability to send presentations to Azure OpenAI gpt-4o-mini for conversion to Markdown. Added the ability for audio/video files under 25MB to be sent to the Azure OpenAI Whisper model for transcription to text.
seanstory
left a comment
Looks like you may need to rebase and force-push - you're pulling in a lot of diff that is not yours.
Also, it looks like you're tightly coupling the Google Drive connector to an Azure AI tool. Which doesn't make a lot of sense to me.
I don't see this being mergeable. If you'd like to have a chat with the team in #connectors-feedback, we can discuss how we can work to better integrate with LLM tools, but this diff as I understand it does not align with our vision for the architecture.
artem-shelkovnikov
left a comment
+1 to Sean's points.
If we want similar functionality, specific connectors are not the place to do it. It should be supported either at the framework level or at a later stage, such as ingest pipelines.
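To make the framework-level suggestion concrete, here is a minimal sketch of what a connector-agnostic enrichment hook could look like. The names `ContentEnricher`, `UppercaseEnricher`, and `apply_enrichers` are hypothetical and are not part of the existing connectors framework; the stand-in enricher only illustrates the wiring and does not call any LLM service.

```python
from typing import List, Optional, Protocol


class ContentEnricher(Protocol):
    """Hypothetical framework-level hook (not an existing connectors API)."""

    def enrich(self, doc: dict, raw: bytes) -> Optional[str]:
        """Return extracted text for the document, or None to skip it."""
        ...


class UppercaseEnricher:
    """Stand-in enricher used only to show the wiring; a real one might call an LLM."""

    def enrich(self, doc: dict, raw: bytes) -> Optional[str]:
        return raw.decode("utf-8").upper()


def apply_enrichers(doc: dict, raw: bytes, enrichers: List[ContentEnricher]) -> dict:
    """Return a copy of doc with 'body' set by the first enricher that yields text."""
    result = dict(doc)
    for enricher in enrichers:
        text = enricher.enrich(doc, raw)
        if text is not None:
            result["body"] = text
            break
    return result
```

With a hook like this, the Google Drive connector would only hand over the document and its raw bytes, and the choice of Azure OpenAI, Whisper, or any other backend would live outside the connector.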
Thanks for the review, and I agree with all points. I am glad to see that this is kicking off a broader discussion, because markdown and text are the future for any GenAI use cases. 😄
Closes #3073 & #3074
In the Google Drive source, this adds the ability to convert presentations to Markdown using Azure OpenAI models and to transcribe audio or video files to text using the Whisper service, with a 10MB size limit.
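The routing described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `choose_enrichment` and the constants are hypothetical names, "10MB" is assumed to mean 10 MiB, and the limit is assumed to apply only to audio and video files. No Azure call is made here; the function only decides which path a file would take.

```python
from typing import Optional

# Assumption: the "10MB" limit is interpreted as 10 MiB.
MAX_MEDIA_BYTES = 10 * 1024 * 1024

# MIME types for Google Slides and PowerPoint (.pptx) files.
PRESENTATION_MIME_TYPES = {
    "application/vnd.google-apps.presentation",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def choose_enrichment(mime_type: str, size_bytes: int) -> Optional[str]:
    """Pick a processing path for a Drive file, or None to leave it alone."""
    if mime_type in PRESENTATION_MIME_TYPES:
        return "markdown"  # would be sent for conversion to Markdown
    if mime_type.startswith(("audio/", "video/")):
        # Assumption: only media files are gated by the size limit.
        return "transcribe" if size_bytes <= MAX_MEDIA_BYTES else None
    return None
```

For example, `choose_enrichment("audio/mpeg", 5_000_000)` selects the transcription path, while an 11 MiB video is skipped.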
Checklists
Pre-Review Checklist
Changes Requiring Extra Attention
This adds dependencies on external services: the Azure-hosted OpenAI gpt-4o-mini and Whisper APIs.
Related Pull Requests
Release Note